Conversation

@Groverkss (Member) commented Aug 31, 2025

Creates a simple build system that produces a CAPI library and Python bindings for lighthouse. Starts with only MLIR core IR and MLIR Func, and exports torch-mlir's fx_importer/onnx_importer under lighthouse.extras.

I'm not an expert on CMake, so this is just a starting point; someone with more CMake knowledge can probably do a much better job.

@rolfmorel (Contributor) left a comment


Thanks @Groverkss !

Could you include some build instructions somewhere, as well as the most basic of tests, to demonstrate what has been built and what should now be available for use, e.g. the mlir bindings and which torch-mlir libs (from Python)?

extern "C" {
#endif

/// Initializes poseidon.
Contributor

What's poseidon?

Member Author

oops haha

# seperation between the compiler and the runtime. The runtime bindings have no
# reason to ship with the mlir python bindings.
# TODO: Add an upstream cmake param for this vs having a global here.
add_compile_definitions("MLIR_PYTHON_PACKAGE_PREFIX=lighthouse.")
Contributor

Can we please just use the mlir. prefix and instead namespace everything specific to lighthouse as mlir.lighthouse? I did it in tpp-mlir, i.e. with the mlir.tpp namespace and downstream dialects just ending up in mlir.dialects: https://github.com/libxsmm/tpp-mlir/blob/main/python/CMakeLists.txt

That every project is renaming the mlir namespace along with its upstream bindings is atrocious, IMHO. (So is the need for downstreams to build the full upstream bindings just to provide minor extensions - To Be Fixed.)

@Groverkss (Member Author) Oct 3, 2025

Why would we use the mlir prefix here? MLIR is a toolkit, not a compiler; we are trying to build a compiler. At most, I would say lighthouse.mlir is a better prefix than this. We can't just go and take over mlir's namespace. I disagree with mlir.lighthouse.

That every project is renaming the mlir namespace along with its upstream bindings is atrocious, IMHO.

I think there is good reason to rename it. Let's say you have two projects on different versions of mlir, both trying to work in the same namespace. How are you going to guarantee that they work together? That just sounds like a disaster.

This is not an imaginary scenario. Let's say you decide to ship 2 pytorch dynamo implementations, lighthouse-1 and lighthouse-2. They don't depend on each other at all, but you ship them under the namespaces mlir.lighthouse1 and mlir.lighthouse2. Now, if your mlir things don't match up, it completely breaks. There is no dependency between these packages, but you still break.
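
Concretely: Python binds each top-level name to exactly one package for the whole process, so whichever mlir build is found first on sys.path serves every consumer, regardless of which commit each downstream was built against. A small runnable sketch of that resolution (nothing lighthouse-specific; the package names are just illustrative):

    import importlib.util

    # Python resolves one location per top-level name; every importer in this
    # process gets that same "mlir" build, whatever it was built against.
    for name in ("mlir", "lighthouse"):
        spec = importlib.util.find_spec(name)
        print(name, "->", spec.origin if spec else "not on sys.path")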

So is the need for downstreams to build the full upstream bindings just to provide minor extensions - To Be Fixed.

I don't know what you are talking about here. Clearly, in this PR the only bindings built are for the func dialect and mlir core. I don't know what the "full upstream bindings" are. We need all of this.

Comment on lines +37 to +38
extras/fx_importer.py
extras/onnx_importer.py
Contributor

Where are these files?

Contributor

That also brings to mind whether these build files have been tested in any sort of way.

@Groverkss (Member Author) Oct 3, 2025

This is how the fx_importer works and is meant to work. torch-mlir exposes these bindings and they only depend on mlir core (which you bring from somewhere) and the func dialect. When building your own aot.compile, you get a torch.export.ExportedProgram and send it down the fx_importer.

Similar thing for onnx.

The fx_importer sources assume that the mlir core IR lives in ..ir: https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir/extras/fx_importer.py#L95
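
Roughly, this is what that layout buys us once the PR is built (a sketch: the import paths assume the lighthouse. prefix set in this PR, and the flow in the comments uses FxImporter names as they appear in torch-mlir, so treat it as illustrative rather than a spec):

    # fx_importer.py begins with relative imports along the lines of
    #     from ..ir import Context, Module, Operation, ...
    #     from ..dialects import func as func_dialect
    # so, re-exported as lighthouse/extras/fx_importer.py, those resolve to
    # lighthouse.ir and lighthouse.dialects.func -- exactly what this PR builds.
    # The intended aot.compile flow is then roughly:
    #     prog = torch.export.export(model, example_args)      # an ExportedProgram
    #     FxImporter(context=ctx).import_frozen_program(prog)  # -> MLIR module
    from lighthouse.extras.fx_importer import FxImporter

    print(FxImporter)  # importable with only core IR + func bindings present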

That also brings to mind whether these build files have been tested in any sort of way.

I need to add tests for these things, but I'm currently only working on this on weekends and whatever time I have. We don't have anything concrete set up, so I'm mainly trying to get things in, then build an aot.compile flow on top, and then work on testing. I can add tests for checking that things are correctly imported, though.

Contributor

Rather than using submodules, can we instead use files with specified commits?

TPP-MLIR has the following: https://github.com/libxsmm/tpp-mlir/blob/main/build_tools/llvm_version.txt

We have found this pattern leads to more flexibility. It is also with an eye to the fact that we expect people to extend lighthouse in various ways, bringing in their own weird dependencies and their own ways of managing them.

@Groverkss (Member Author) Oct 3, 2025

Rather than using submodules, can we instead use files with specified commits?

Why? That's just .gitmodules with extra steps.

We have found this pattern leads to more flexibility. It is also with an eye to the fact that we expect people to extend lighthouse in various ways, bringing in their own weird dependencies and their own ways of managing them.

You do an out-of-tree build to extend it, or you fork it. I don't see what you mean by bringing in your own weird dependencies: if you fork, you can add whatever submodules you want. The build system can support BYOLLVM; I just didn't set it up because I want something basic to start with.

# If building features that require Python development, find them early in
# one invocation (some CMake versions are sensitive to resolving out of order).
#-------------------------------------------------------------------------------
option(LIGHTHOUSE_BUILD_PYTHON_BINDINGS "Builds the Lighthouse python bindings" ON)
Contributor

Can we have one "test" file to document what we expect to be built? E.g. just a Python file containing from lighthouse import ir, or rather from mlir import ir, lighthouse.
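
For example, something along these lines (a hedged sketch assuming the lighthouse. prefix this PR sets up; the exact set of modules may differ):

    # Smoke test: everything this PR claims to build should at least import
    # and give us a working MLIR context.
    from lighthouse import ir
    from lighthouse.dialects import func          # the one upstream dialect built here
    from lighthouse.extras import fx_importer     # re-exported torch-mlir fx importer
    from lighthouse.extras import onnx_importer   # re-exported torch-mlir onnx importer

    with ir.Context(), ir.Location.unknown():
        module = ir.Module.create()
        assert module.operation.verify()
    print("lighthouse python bindings OK")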

Member Author

Yes, I need to add a test before I land this; I will do that.

FetchContent_Declare(
nanobind
GIT_REPOSITORY https://github.com/wjakob/nanobind.git
GIT_TAG 0f9ce749b257fdfe701edb3cf6f7027ba029434a # v2.4.0
Contributor

There's movement upstream on nanobind versions. I think llvm/llvm-project#161230 is most pertinent. Before landing, let's make sure we are aligned with what upstream is doing/about to do.

Member Author

We land and then integrate the commit and change. This is the correct way of doing it until the PR lands. Once it lands, we can integrate it.

@rolfmorel (Contributor)

As @makslevental has been making a lot of progress on the Python bindings lately -- thank you!! -- maybe he has a better perspective on setting up a new downstream project that extends upstream.

E.g., I would like for lighthouse to be an out-of-tree build and rely on the upstream Python bindings for the upstream dialects being built with the rest of MLIR. Any new dialects or C++-MLIR code that needs bindings for lighthouse -- god forbid we need them -- should preferably be a separate small .so that we can load alongside the upstream bindings.

@makslevental , do you have any suggestions?

@makslevental commented Oct 2, 2025

should preferably be a separate small .so that we can load alongside the upstream bindings.

Just so I'm not assuming things, can you give me a user story? What I think you're saying is that you want:

  1. pip install upstream-mlir-python-bindings
  2. pip install downstream-mlir-python-bindings
  3. from mlir.dialects import arith
    from mlir.ir import Context, Location, Module, InsertionPoint
    from downstream.dialects import my_dialect
    with Context(), Location.unknown():
        m = Module.create()
        with InsertionPoint(m.body):
            op1 = arith.constant(...)
            op2 = my_dialect.my_op(...)
        assert m.operation.verify()

Is that accurate?

@rolfmorel (Contributor) commented Oct 2, 2025

should preferably be a separate small .so that we can load alongside the upstream bindings.

Just so I'm not assuming things, can you give me a user story? What I think you're saying is that you want:

  1. pip install upstream-mlir-python-bindings
  2. pip install downstream-mlir-python-bindings
  3. from mlir.dialects import arith
    from mlir.ir import Context, Location, Module, InsertionPoint
    from downstream.dialects import my_dialect
    with Context(), Location.unknown():
        m = Module.create()
        with InsertionPoint(m.body):
            op1 = arith.constant(...)
            op2 = my_dialect.my_op(...)
        assert m.operation.verify()

Is that accurate?

Yes, essentially this! As MLIR is still quite unstable, I am okay for users to need to fiddle to get the exact versions (i.e. commits) of upstream-mlir-python-bindings and downstream-mlir-python-bindings to line up, but beyond that I would, as a user, very much like for what you suggest to Just Work.

As a developer of a non-llvm/llvm-project compiler, e.g. lighthouse, I would like to just take any* build of llvm-project with MLIR and its Python bindings enabled and 1) build a small .so that just adds a couple of dialects and 2) add a python "package", i.e. folder, for the corresponding Python bindings (and just for those dialects).

*: modulo needing to make sure the API I am using is correct for the specific commit of llvm-project.

@makslevental

Yes, essentially this!

Yeah, too bad it's impossible currently because of how symbol resolution happens between the bindings code (the Python C extensions with nanobind code) and the MLIR (C) APIs. I can explain more shortly, but I just wanted to quickly say that it is indeed impossible currently.

The rest I agree with you on completely - it's what I wanted when I started a couple of years ago.

@makslevental

Here's what it would take: llvm/llvm-project#161782 (i.e. hardening that PR).

@Groverkss (Member Author) commented Oct 3, 2025

E.g., I would like for lighthouse to be an out-of-tree build and rely on the upstream Python bindings for the upstream dialects being built with the rest of MLIR. Any new dialects or C++-MLIR code that needs bindings for lighthouse -- god forbid we need them -- should preferably be a separate small .so that we can load alongside the upstream bindings.

I'm not sure when we decided this is the plan for lighthouse. I disagree with a lot of this.

Any new dialects or C++-MLIR code that needs bindings for lighthouse -- god forbid we need them -- should preferably be a separate small .so that we can load alongside the upstream bindings.

We will absolutely need new dialects. We will absolutely need to expose them as bindings. We cannot rely on upstream binding wheels (which upstream doesn't even ship! maks is kind enough to build and ship them for us).

rely on the upstream Python bindings for the upstream dialects being built with the rest of MLIR

That's a weird way of handling dependencies. Let's say an mlir commit completely breaks lighthouse and we need at least 1-2 weeks of volunteer time to fix it. We cannot ask upstream to revert the commit, and we also cannot just get stuck in a cycle of not integrating new commits because a commit broke us. Instead, we locally revert it in our fork and keep iterating while someone works on unreverting the local revert we did. If we rely on upstream builds for this, this is never going to work. Also, I'm very scared of how we will handle version mismatches this way.

My understanding of lighthouse is as a compiler: if we build a compiler, we build everything with it and ship it. Does clang / rust support a shipped version where they ask you to download a specific version of llvm and then use clang with it? It just sounds like a version mismatch waiting to happen.

I'm not an expert on this python stuff; I just know the things I'll need to make the export path work. I'm sure @makslevental can correct me and has a better way, but all in all, this is a basic setup to start building a proper torch.export flow. If we want to improve it, we can land and iterate. But let's not stop because a simple setup isn't the "ideal" setup you want.

@rengolin (Member) commented Oct 3, 2025

E.g., I would like for lighthouse to be an out-of-tree build and rely on the upstream Python bindings for the upstream dialects being built with the rest of MLIR. Any new dialects or C++-MLIR code that needs bindings for lighthouse -- god forbid we need them -- should preferably be a separate small .so that we can load alongside the upstream bindings.

I'm not sure when we decided this is the plan for lighthouse. I disagree with a lot of this.

The original design was for lighthouse to be a facilitator, not a compiler on its own. We're not trying to replicate iree/tpp-mlir here; we're trying to help existing (and new) MLIR-based compilers reuse (or repeat) the infrastructure needed to use MLIR in a compiler.

If we create a structure here that is not only opinionated (that's fine) but hard-coded to a particular path, we'll make it harder for people to reuse, and we'll limit the ability to repeat it (i.e. copy and paste). That's not a good design.

The shared object method is one that has been discussed in upstream MLIR for years and is in use already, including in LLVM, so it's not a new thing. If there's a better way, I'm all ears.

The main requirement is to be able to test different downstream compilers on the same lighthouse project just by adding components to it (say, a branch or a fork), and not have to change CMake / submodules / glue code at all.

We will absolutely need new dialects. We will absolutely need to expose them as bindings. We cannot rely on upstream binding wheels (which upstream doesn't even ship! maks is kind enough to build and ship them for us).

One of the key design principles in the forum discussions was that we most certainly don't want to have to create new dialects in lighthouse. If we need something from MLIR that it doesn't currently do, we make the case for it upstream and do the work there, not here.

We may need to create opinionated infrastructure and glue code, but such code needs to be as independent as possible (including using shared objects, as implementation details).

That's a weird way of handling dependencies. Let's say an mlir commit completely breaks lighthouse and we need at least 1-2 weeks of volunteer time to fix it. We cannot ask upstream to revert the commit,

I want to get into a place where we most definitely should ask upstream to fix it, and if not, revert it. This is already the case for the LLVM test-suite, and is one of the key reasons why we created the lighthouse project inside the LLVM organization.

and we also cannot just get stuck in a cycle of not integrating new commits because a commit broke us. Instead, we locally revert it in our fork and keep iterating while someone works on unreverting the local revert we did. If we rely on upstream builds for this, this is never going to work. Also, I'm very scared of how we will handle version mismatches this way.

If you treat lighthouse as yet another downstream, then it becomes scary indeed. But that's the wrong way of looking at it.

This project is meant to be as canonical as the LLVM test-suite in the first instance, and as clang in the long term.

My understanding of lighthouse is as a compiler: if we build a compiler, we build everything with it and ship it.

Not at all. Lighthouse is a framework for building and testing compilers, bundled with a naive compiler plugin infrastructure as an example. If you want to build a new compiler, you can use that as a starting point. If you want to test your existing compiler, you can plug in using the same framework.

Does clang / rust support a shipped version where they ask you to download a specific version of llvm and then use clang with it? It just sounds like a version mismatch waiting to happen.

Clang is one compiler. LLVM also integrates with a huge number of other compilers and the LLVM test-suite can test all of them (providing you can compile the tests).

As expressed in the README, the first stage is to allow testing of different components and compilers; then we create a naive compiler example (in plugin form) while maintaining the ability of other compilers to co-exist.

Only if all of that works and we form a community around it (like clang did) do we begin talking about stronger structural guarantees (as clang does). But that's probably a decade away, so I would not worry about it now.
